Chronic periodontitis (CP) is a common oral disease that confers substantial systemic inflammatory and 
microbial burden and is a major cause of tooth loss. Here, we present the results of a genome-wide association 
study of CP that was carried out in a cohort of 4504 European Americans (EA) participating in the 
Atherosclerosis Risk in Communities (ARIC) Study (mean age: 62 years; moderate CP: 43%; severe 
CP: 17%). We detected no genome-wide significant association signals for CP; however, we found suggestive 
evidence of association (P < 5 × 10⁻⁶) for six loci, including NIN, NPY and WNT5A for severe CP and NCR2, EMR1 and 
10p15 for moderate CP. Three of these loci had concordant effect size and direction in an independent sample 
of 656 adult EA participants of the Health, Aging, and Body Composition (Health ABC) Study. Meta-analysis 
pooled estimates were: severe CP (n = 958 versus health: n = 1909): NPY, rs2521634 [G]: odds ratio (OR) = 
1.49 [95% confidence interval (CI) = 1.28-1.73, P = 3.5 × 10⁻⁷]; moderate CP (n = 2293): NCR2, rs7762544 
[G]: OR = 1.40 (95% CI = 1.24-1.59, P = 7.5 × 10⁻⁸), EMR1, rs3826782 [A]: OR = 2.01 (95% CI = 1.52-2.65, 
P = 8.2 × 10⁻⁷). Canonical pathway analysis indicated significant enrichment of nervous system signaling, 
cellular immune response and cytokine signaling pathways. A significant interaction of NUAK1 (rs11112872, 
interaction P = 2.9 × 10⁻⁹) with smoking in ARIC was not replicated in Health ABC, although estimates of 
heritable variance in severe CP explained by all single nucleotide polymorphisms increased from 18 to 52% with 
the inclusion of a genome-wide interaction term with smoking. These genome-wide association results provide 
information on multiple candidate regions and pathways for interrogation in future genetic studies of CP. 



INTRODUCTION 

Chronic periodontitis (CP) is a common complex disease of 
the oral cavity that is characterized by an inflammatory 
response to commensal and pathogenic oral bacteria (1) and 
is found in about 20% of the adult US population. It manifests 
with gingival pocket formation and clinical attachment loss 
(CAL) and results in gradual destruction of periodontal 
tissues and tooth-supporting alveolar bone. CP is considered 
the main cause of tooth loss among adults and is associated 
with severe quality of life impacts (2). Moreover, a growing 
body of evidence links the disease with increased risk of sys- 
temic conditions, including coronary heart disease (CHD) (3), 
pregnancy outcomes (4), poor diabetes mellitus (DM) control 
(5) and others. Although definitive mechanistic evidence is 
lacking, it has been suggested that these links may be mediated 
to some degree via both the microbial component and the 
inflammatory burden that characterize the disease (6-8). 

Risk factors for CP have been well studied and include 
smoking and DM. In addition, age, race and obesity have also 
been shown to be important risk indicators (9). A genetic com- 
ponent of CP risk was supported by early reports of familial ag- 
gregation of severe forms of the disease (10), as well as twin 
studies (11), but the magnitude of risk conferred by genetics 
and the role of specific genes has been under debate. Recent can- 
didate gene studies for CP have focused on genes related to host 
immunity and inflammatory response such as cytokines, cell- 
surface receptors, chemokines, enzymes and antigen recognition. 
Most of these studies have examined polymorphisms in 
the interleukin (IL)-1, IL-6, Fc gamma receptor (FCGR2A), 
tumor necrosis factor (TNF) alpha, human vitamin D receptor 
(VDR), cluster of differentiation (CD)-14, matrix 
metalloproteinase-1, toll-like receptor (TLR), cyclo-oxygenase-2 
(COX-2) and C-reactive protein genes, with mixed results 
(12,13). A recent meta-analysis of studies investigating the 
association between CP and IL1 polymorphisms reported 
significantly elevated summary estimates for two IL1A 
polymorphisms [rs1800587 and rs17561, in tight linkage disequilibrium 
(LD), r² = 1.0; odds ratio (OR) = 1.48; 95% confidence 
interval (CI) = 1.17-1.86] and one IL1B polymorphism 
(rs1143634; OR = 1.54; 95% CI = 1.03-2.30); the authors, 
however, noted an underlying heterogeneity in the published 
estimates (14). Whereas most of what is known or hypothesized 
with regard to genetic risk loci for CP has been based on candi- 
date gene studies, to date no genome-wide association study 
(GWAS) has been performed. To add to the knowledge base 
of the genetic etiology of CP, this study aimed to investigate 
genetic risk loci for CP using a genome-wide association 
(GWA) approach in the context of a well-defined cohort. 

RESULTS 

GWA analysis results in the Atherosclerosis Risk 
in Communities (ARIC) cohort 

The Atherosclerosis Risk in Communities (ARIC) participants 
had a mean age of 62 years, with a balanced sex distribution. 
Twelve percent were current smokers, and 11% had DM. 
In terms of Centers for Disease Control (CDC)/American 
Academy of Periodontology (AAP) periodontal diagnoses, 
17% (n = 761) were classified as severe, 43% (n = 1920) as 
moderate and 40% (n = 1823) as healthy. The descriptive 
characteristics of the European American Dental ARIC cohort 
participants that were included in this analysis are presented in the 
Supplementary Material, Table S1. After exclusions that are 
described in the Materials and Methods section, there were 
2 135 236 single nucleotide polymorphisms (SNPs) included in 
the GWA analysis. Genomic inflation factors were low, 1.019 
for moderate and 1.024 for severe CP (Fig. 1). No genome-wide 
significant signals were noted; however, 26 SNPs in 6 loci (3 for 
moderate and 3 for severe CP) emerged below a P < 5 × 10⁻⁶ 
threshold (Fig. 2). The SNP with the lowest P-value in each of 
these six loci was, thus, prioritized for further annotation and 
follow-up (Table 1). The quality scores and proxies of these 
prioritized SNPs are presented in the Supplementary Material, 
Table S2. Adjustment for smoking and diabetes did not result 
in any important change in estimate (less than 10%) of 
association for these SNPs (Supplementary Material, Table S3). 
Similarly, sex-stratified analyses did not reveal any departures from 
homogeneity (Supplementary Material, Table S4). Consistent 
with the current trends in reporting of GWAS (15), we have 
made available the results of the entire set of SNPs that we 
investigated in ascending P-value order at: 
http://genomewide.net/public/aric/dental/periodontitis/CDC/cdc1vs0_full.txt and 
http://genomewide.net/public/aric/dental/periodontitis/CDC/cdc2vs0_full.txt. 

Figure 1. Quantile-quantile plots of genome-wide association analysis results 
of severe CP (A) and moderate CP (B) complex in the ARIC cohort. 

Figure 2. Manhattan plots of genome-wide association analysis results of severe CP (A) and moderate CP (B) complex in the ARIC cohort. 
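
As an illustration of how a genomic inflation factor such as those quoted for Figure 1 can be obtained, the following minimal Python sketch converts two-sided P-values to 1-df chi-square statistics and divides their median by the median expected under the null. This is one common definition of the factor; the text does not state which software or exact formula was used, and the function names and the evenly spaced P-values are assumptions made for the example only.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom (about 0.4549).
# If Z is standard normal, P(Z**2 <= m) = 0.5 when sqrt(m) is the 75th percentile of Z.
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def p_to_chi2(p):
    """Convert a two-sided P-value (0 < p <= 1) to a 1-df chi-square statistic."""
    z = NormalDist().inv_cdf(1.0 - p / 2.0)
    return z * z


def genomic_inflation(p_values):
    """Median observed chi-square divided by the median expected under the null."""
    return median(p_to_chi2(p) for p in p_values) / CHI2_1DF_MEDIAN


# Hypothetical input: evenly spaced P-values mimic a well-calibrated scan.
null_like = [(i + 0.5) / 1000 for i in range(1000)]
lambda_null = genomic_inflation(null_like)

# Squaring the P-values makes them systematically too small (an inflated scan).
lambda_inflated = genomic_inflation([p * p for p in null_like])
```

A value close to 1, as reported for both CP traits, indicates little evidence of systematic inflation of the test statistics.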

Visualizations of the six loci of interest, along with nearby 
genes and recombination rates, are presented in Figure 3. For 
severe CP, the strongest (with respect to P-value) association 
in the 14q21 locus was produced by rs12883458. This SNP is 
located in an intron of Ninein (GSK3B interacting protein; 
NIN). The minor allele [C] showed a 3.5% enrichment among 
severe CP patients and was associated with an OR = 1.89 
(95% CI = 1.48-2.41; P = 3.5 × 10⁻⁷). In the 7p15 locus, 
the major allele [G] of an intergenic SNP, rs2521634 [minor 
allele frequency (MAF) [A] = 0.25; 47 Kb downstream from 
Neuropeptide Y (NPY)], produced an OR = 1.47 (95% CI = 
1.25-1.73; P = 1.6 × 10⁻⁶). Similarly, the major allele [G] 
of rs11925054 (MAF [T] = 0.14) in the 3p21 locus, 
downstream from wingless-type mouse mammary tumor virus 
integration site family, member 5A (WNT5A) and ELKS/ 
RAB6-interacting/CAST family member 2 (ERC2), produced 
the strongest signal in the region (OR = 1.69; 95% CI = 
1.37-2.10; P = 6.5 × 10⁻[?]), showing 4% enrichment among 
severe CP patients. 

With regard to moderate CP, the SNP with the lowest 
P-value in the 6p21.1 locus was rs7762544 (MAF [G] = 
0.18; OR = 1.41; 95% CI = 1.24-1.60; P = 1.1 × 10⁻⁷). 
This variant is 61 Kb downstream from natural cytotoxicity 
triggering receptor 2 (NCR2), and its minor (risk) allele 
showed 4.4% enrichment among moderate CP patients when 
compared with healthy participants. SNP rs3826782 (MAF 
[A] = 0.07), located in an intronic region of Egf-like module 
containing, mucin-like, hormone receptor-like 1 (EMR1) 
and 30 Kb downstream from Vav 1 guanine nucleotide 
exchange factor (VAV1), provided the strongest signal in that 
locus (OR = 2.00; 95% CI = 1.48-2.70; P = 4.0 × 10⁻⁶), 
showing a small 1% enrichment among ARIC 'cases'. 
Examination of the SCAN database for expression-associated data 
(Supplementary Material, Table S5) revealed that this SNP 
has been associated (P = 2 × 10⁻⁵) with the expression of 
GPR113 on lymphoblastoid cell lines (LCLs). GPR113 has 
been found to be expressed intraorally, by taste receptor 
cells (16), and functions in the neuropeptide signaling 
pathway [Gene Ontology (GO) accession: 0007218] and 
signal transducer activity (GO accession: 0004871), 
pathways relevant to NPY. Finally, the major allele [G] of 
rs12260727 (MAF [A] = 0.15) in the 10p15 locus was 
associated with OR = 1.54 (95% CI = 1.30-1.82; P = 6.0 × 
10⁻⁷). SNP rs12260727 is located distant (700 Kb upstream) 
from the closest gene CUGBP, Elav-like family member 2 
(CELF2). As expected, all SNPs in LD (r² > 0.80) displayed 
effect estimates that were directionally consistent with, and 
of similar magnitude to, those of the index SNP at each locus. 

Examination of the prioritized SNPs in the subset of 1020 
European American Dental ARIC participants previously 
reported in a GWAS of periodontal pathogen colonization 
by Divaris et al. (17) showed no important association with 
high colonization with periodontal pathogens of the 'red' or 
'orange' complex, or individual bacteria such as Aggregati- 
bacter actinomycetemcomitans and Porphyromonas gingivalis 
(Supplementary Material, Table S6). The number of concord- 
ant effect estimates (n = 13) that we detected in this additional 
'look-up' was not statistically different from what would be 
expected by chance alone, among 24 examined effect esti- 
mates (binomial test P = 0.5). However, NPY and NCR2 
were two loci that showed concordant effect estimates with 
all four bacterial traits. Examination of additional polymorphisms 
in the CD14, FCGR2A, IL1A, IL1B, IL1RN, IL4, IL6, 
IL10, TLR4, TNF and VDR genes that have previously been 
reported (13) as associated with CP in at least one candidate 
gene study among whites of European descent did not show 
any evidence of association (Table 2). 
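
The chance comparison for the 13 concordant estimates among 24 can be sketched with an exact binomial tail probability under a null concordance probability of 0.5. The text does not specify whether the reported test was one- or two-sided or how it was computed, so the code below is only an illustration under those assumptions; the function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p=0.5):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


# 13 concordant effect estimates among 24 examined, null probability 0.5.
one_sided = binomial_upper_tail(13, 24)      # about 0.42
two_sided = min(1.0, 2.0 * one_sided)        # doubling convention, about 0.84
```

Under either convention the count is compatible with chance, which agrees with the conclusion drawn in the text.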

Replication and meta-analysis in the Health, Aging 
and Body Composition (Health ABC) Study 

Three SNPs associated with severe CP (rs12883458, 
rs2521634 and rs11925054) and three SNPs associated 
with moderate CP (rs7762544, rs3826782, rs12260727) in 
the ARIC Study (P < 5 × 10⁻⁶) were investigated using the 
next largest cohort available to us, the Health ABC Study 
(n = 686). The descriptive characteristics of the Health, 
Aging, and Body Composition (Health ABC) Study 
participants included in this analysis are presented in Supplementary 
Material, Table S1. 

Figure 3. Visualization of the six loci that were prioritized based on a P < 5 × 10⁻⁶ criterion in the ARIC cohort. Three loci were prioritized based on their 
association with severe CP (A: 14q21, NIN, rs12883458; B: 7p15, NPY, rs2521634; C: 3p21, WNT5A/ERC2, rs11925054) and three for their association with 
moderate CP (D: 6p21.1, NCR2, rs7762544; E: 19p13.3, EMR1, rs3826782; F: 10p15, rs12260727). The vertical axis corresponds to each locus' (index) 
SNP -log10 P-value; the SNP associated with the lowest P-value is labeled. The overlaid recombination rate plot and the color-coded pairwise linkage 
disequilibrium values with index SNPs were calculated based on HapMap II-CEU (human genome 18, build 36). 

Table 2. P-values for the association with severe and moderate CP in the Dental 
ARIC GWAS of polymorphisms that have been examined and reported as 
associated in at least one study among whites of European descent in the review 
of Laine et al. (13) 

                                      MAF (a)           ARIC GWAS P (b) 
Gene     SNP         Position         (HapMap II-CEU)   Severe CP   Moderate CP 
                     (build 36) 

CD14     rs2569190   139993100        [A] 0.47          0.57        0.75 
FCGR2A   rs1801274   159746369        [A] 0.50          0.66        0.37 
IL10     (two entries; SNP IDs, positions, MAF and P-values illegible in source) 
IL1A     rs1800587   113259431        [A] 0.31          0.20        0.51 
         rs17561     113253694        [A] 0.31          0.20        0.50 
IL1B     rs1143634   113306861        [A] 0.26          0.21        0.27 
IL1RN    rs419598    113603678        [C] 0.27          0.31        0.97 
IL4      rs2070874   132037609        [T] 0.15          0.17        0.62 
         rs2243250   132037053        [T] 0.15          0.17        0.63 
IL6      rs1800795   22733170         [C] 0.43          0.87        0.18 
         rs2069827   22731981         [T] 0.10          0.31        0.49 
MMP1     rs475007    102174522        [T] 0.44          0.90        0.12 
TLR4     rs4986791   119515423        [T] 0.06          0.93        0.27 
         rs4986790   119515123        [G] 0.06          0.93        0.27 
TNF      rs1800629   31651010         [A] 0.17          0.68        0.77 
VDR      rs731236    46525024         [G] 0.41          0.69        0.59 
         rs1544410   46526102         [T] 0.41          0.68        0.59 

(a) Minor allele frequency. 
(b) Based on logistic regression, log-additive models, including terms for age, sex, 
study center and ancestry (10 first PCs). 

Of the three SNPs investigated with severe CP, one SNP 
(NPY locus: rs2521634) was nominally associated with 
severe CP in the Health ABC Study (OR = 1.64; 95% CI = 
1.01-2.65; P = 0.046), with a concordant effect direction 
and magnitude as in ARIC (Table 3). Similarly, the EMR1 
locus (rs3826782) showed concordant effect direction and 
magnitude (OR = 2.06; 95% CI = 1.03-4.60; P = 0.06). 
The NCR2 (rs7762544) locus showed similar effect direction 
for moderate CP risk in Health ABC (OR = 1.32, 95% CI = 
0.81-2.14; P = 0.27). None of these SNPs met statistical 
significance criteria based on a Bonferroni-corrected (assuming 
α = 0.05 and six independent tests) P-value criterion of 
0.0083. No significant effect was observed with rs12260727 
at the 10p15 locus (OR = 0.93; 95% CI = 0.56-1.57; P = 
0.8). The remaining three loci showed non-significant effects 
of opposite direction in Health ABC. Allele frequencies of 
these SNPs in the Health ABC sample are presented in the 
Supplementary Material, Table S7. 
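
The Bonferroni criterion quoted for the replication stage follows directly from dividing the family-wise alpha by the number of tests. The short Python sketch below reproduces that arithmetic and applies it to the three Health ABC P-values given in the text; the dictionary layout and function name are assumptions made for illustration.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


# Health ABC replication P-values as reported in the text.
health_abc_p = {
    "rs2521634 (NPY)": 0.046,
    "rs3826782 (EMR1)": 0.06,
    "rs7762544 (NCR2)": 0.27,
}

threshold = bonferroni_threshold(0.05, 6)  # six independent tests
passes = {snp: p < threshold for snp, p in health_abc_p.items()}
```

With a threshold of roughly 0.0083, none of the three SNPs passes, although rs2521634 is nominally significant at the uncorrected 0.05 level.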

The three SNPs (rs2521634, rs7762544 and rs3826782) 
that showed concordant effect direction and magnitude in 
the discovery and replication cohorts were carried forward for 
meta-analysis, resulting in the following summary estimates: 
severe CP: NPY, rs2521634 [G]: pooled OR = 1.49 (95% CI = 
1.28-1.73; P = 3.5 × 10⁻⁷); moderate CP: NCR2, 
rs7762544 [G]: pooled OR = 1.40 (95% CI = 1.24-1.59; P = 
7.5 × 10⁻⁸), EMR1, rs3826782 [A]: pooled OR = 2.01 (95% 
CI = 1.52-2.65; P = 8.2 × 10⁻⁷). 
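
To show how two cohort estimates can be combined, the sketch below applies a fixed-effect, inverse-variance meta-analysis to the published NPY (rs2521634) odds ratios from ARIC and Health ABC. The meta-analysis method actually used is not described in this part of the text, so the fixed-effect model is an assumption, and the standard errors are back-calculated from the rounded 95% confidence intervals rather than taken from the original regression output.

```python
from math import erfc, exp, log, sqrt

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_or_and_se(odds_ratio, ci_low, ci_high):
    """Recover the log odds ratio and its standard error from a 95% CI."""
    return log(odds_ratio), (log(ci_high) - log(ci_low)) / (2.0 * Z95)


def fixed_effect_pool(estimates):
    """Inverse-variance pooling of (OR, CI low, CI high) tuples."""
    weight_sum = 0.0
    weighted_beta = 0.0
    for odds_ratio, ci_low, ci_high in estimates:
        beta, se = log_or_and_se(odds_ratio, ci_low, ci_high)
        weight = 1.0 / (se * se)
        weight_sum += weight
        weighted_beta += weight * beta
    beta_pooled = weighted_beta / weight_sum
    se_pooled = sqrt(1.0 / weight_sum)
    z = beta_pooled / se_pooled
    p_two_sided = erfc(abs(z) / sqrt(2.0))
    return (
        exp(beta_pooled),
        exp(beta_pooled - Z95 * se_pooled),
        exp(beta_pooled + Z95 * se_pooled),
        p_two_sided,
    )


# Severe CP, rs2521634 [G]: ARIC and Health ABC estimates from the text.
pooled_or, pooled_low, pooled_high, pooled_p = fixed_effect_pool(
    [(1.47, 1.25, 1.73), (1.64, 1.01, 2.65)]
)
```

Under these assumptions the pooled odds ratio comes out near 1.49, in line with the reported summary estimate; small differences in the interval and P-value are expected because the inputs are rounded.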

Genome-wide interactions with smoking 

The exploratory genome-wide interaction analysis between all 
SNPs and smoking history (ever versus never smoker) 
revealed one genome-wide significant locus for moderate CP 
at NUAK1 (12q23.3), where the intronic rs11112872 (MAF 
[G] = 0.42) produced an interaction term P = 2.9 × 10⁻⁹. 
Smoking-stratified estimates of moderate CP for the minor 
allele were: never smokers (n = 1825), OR = 0.77 (95% 
CI = 0.67-0.88; P = 2.2 × 10⁻⁴) and ever smokers (n = 
1838), OR = 1.42 (95% CI = 1.23-1.64; P = 2.8 × 10⁻⁶). 
This interaction for moderate CP was not replicated in 
Health ABC (never smokers: n = 227; ever smokers: n = 
232 in the moderate CP versus health analysis), where the 
stratified estimates were of discordant direction when 
compared with ARIC. All other interaction estimates for moderate 
and severe CP were of substantially smaller magnitude 
(P-values around 10⁻⁶ or greater) and are available as 
Supplementary Material at 
http://genomewide.net/public/aric/dental/periodontitis/CDC/cdc1vs0_smoke.txt and 
http://genomewide.net/public/aric/dental/periodontitis/CDC/cdc2vs0_smoke.txt. 
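
The contrast between the two smoking strata can be checked approximately from the published stratum-specific odds ratios. The sketch below computes the ratio of odds ratios and a z-test for the difference in log odds ratios, treating the strata as independent. This is not the interaction model fitted in the study (which included a SNP-by-smoking term), and the standard errors are again back-calculated from rounded confidence intervals, so it is a rough consistency check only.

```python
from math import erfc, exp, log, sqrt

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_or_and_se(odds_ratio, ci_low, ci_high):
    """Recover the log odds ratio and its standard error from a 95% CI."""
    return log(odds_ratio), (log(ci_high) - log(ci_low)) / (2.0 * Z95)


# rs11112872 minor allele, moderate CP, ARIC strata from the text.
beta_never, se_never = log_or_and_se(0.77, 0.67, 0.88)
beta_ever, se_ever = log_or_and_se(1.42, 1.23, 1.64)

# Ratio of odds ratios (ever versus never smokers) and its z-test.
ratio_of_ors = exp(beta_ever - beta_never)
z_diff = (beta_ever - beta_never) / sqrt(se_never**2 + se_ever**2)
p_diff = erfc(abs(z_diff) / sqrt(2.0))
```

The resulting z statistic is about 6, which is of the same order as the reported interaction P-value and illustrates why opposite-direction stratum effects produce a strong interaction signal.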

Estimation of variance explained by all SNPs 

The results of the examination of heritable variance explained 
by all GWAS SNPs are presented in Table 4. Variance 
explained (h²) estimates were consistently higher for severe 
CP when compared with moderate CP. Some variations in 
the h² estimates were also noted with regard to the SNP sets 
that were used for analyses. Interestingly, higher variance in 
moderate disease was explained with the use of imputed 
versus genotyped SNPs, whereas the inverse was true for 
severe CP. After adjustment for population stratification and 
using imputed SNPs, h² was 7% for moderate CP and 8% 
for severe CP. Higher phenotypic variance (18%) was 
explained in severe disease with the use of genotyped-only 
SNPs. However, standard errors were large for all these esti- 
mates. Notably, the inclusion of a G x E interaction term 
with smoking history produced high estimates of variance 
jointly explained by SNPs and their interaction with 
smoking, ranging between 36 and 41% for moderate CP and 
42 and 52% for severe CP. 

Canonical signaling pathway analysis 

Of the 147 Ingenuity Pathway Analysis (IPA) canonical 
signaling pathways that were tested in the moderate and severe 
CP results data, 11 were significantly enriched (P < 3.4 × 
10⁻⁴) (Supplementary Material, Figs S1 and S2). In contrast, 
none of the metabolic pathways that we tested as controls 
was associated with the data. Two of the identified pathways 
(axonal guidance and neuropathic pain signaling) were 
common for severe and moderate CP, eight (i.e. synaptic 
long-term potentiation, cAMP responsive element binding protein 
signaling in neurons, N-formyl-methionyl-leucyl-phenylalanine 
(fMLP) signaling in neutrophils, neuronal nitric oxide 
synthase 1 signaling in neurons and virus entry via endocytic 
pathways) were unique to severe CP and one (amyotrophic 
lateral sclerosis signaling) was unique to moderate CP. 
Overall, neurotransmitters and nervous signaling pathways 
were the majority (9 out of 12) of those enriched. Axonal 
guidance signaling was the most significant (severe CP: P = 
2.1 × 10⁻⁷, ratio = 0.194; moderate CP: P = 1.4 × 10⁻⁵, 
ratio = 0.173) and included WNT5A, the only prioritized 
locus that was represented in the top pathways. Axonal 
guidance signaling remained the top canonical pathway for 
both moderate and severe CP when the pathway analysis 
was repeated using a smaller set of loci that met more stringent 
association criteria (P < 10⁻³), but did not reach statistical 
significance. The remaining three identified pathways were 
cellular immune response related, namely fMLP signaling in 
neutrophils, CXCR4 signaling and virus entry via endocytic 
pathways. 

DISCUSSION 

In this work, we present the results of the first genome-wide 
investigation of CP. In our GWAS, we identified six loci pro- 
viding suggestive evidence of association with CP. Although 
no signals met genome-wide significance criteria, we found 
that three of these loci (NPY, NCR2 and EMR1) had concord- 
ant effect direction in a replication sample, with NPY being 
nominally associated with severe CP. Similarly, a genome- 
wide significant interaction of NUAK1 with smoking was not 
replicated. Nevertheless, our estimates of phenotypic heritabil- 
ity explained by all SNPs were significantly increased with 
the inclusion of a G x E interaction term with smoking. Fur- 
thermore, our findings suggest an important role of neurotrans- 
mitter and nervous system signaling pathways in the risk of 
CP, complementary to immune response-related ones. 

Although limited by its sample size of about 5000 subjects, 
this study was conducted using a well-defined cohort with 
detailed characterization of CP, using full-mouth periodontal 
examinations and the latest consensus taxonomy of CP. 
These results provide a wealth of new information on potential 
candidate genes that will require further exploration, 
replication and validation in future studies. Our GWA scan did not 
identify any SNP that met strict genome-wide statistical 
significance criteria (P < 5 × 10⁻⁸). However, many true 
signals, elements of the 'missing heritability', may be found 
below this threshold (18-22). Therefore, we used a more 
lenient P-value threshold for prioritization and annotation of 
a set of 'promising' SNPs, reported estimates of variance 
explained by all GWAS SNPs and made available the com- 
plete set of our GWAS results. 

The fact that virtually all prioritized loci are related to host 
immunity is consistent with the current understanding of the 
etiology and pathogenesis of periodontal diseases (1,8,23). 
The preponderance of neurotransmitter and nervous system 
signaling pathways among those significantly associated with 
our data echoes reports that emphasize the role of the nervous 
system in the pathophysiology of peripheral inflammation 
and suggest a neurogenic inflammatory component for 
periodontitis (24). In fact, a recent study found differences in 
gingival crevicular fluid levels of NPY between healthy and 
periodontitis-affected sites (25). Nevertheless, CP is a 
complex, polygenic disease, wherein multiple genes are 
likely to confer risk or protection by influencing the host 
inflammatory response and the qualitative and quantitative 
composition of the oral microbiome. Notably, two of the six 
prioritized loci (NPY and NCR2) showed concordant effect 
estimates for all examined 'high periodontal pathogen 
colonization' traits in a previous GWAS (17), an observation that 
parallels the pathogenic oral microbial shift that is 
characteristic of periodontitis (8). 



Human Molecular Genetics, 2013, Vol. 22, No. 11 2319 



Table 4. Phenotypic variance explained for moderate and severe CP by all genotyped and imputed autosomal SNPs available and their interaction with smoking 
history (ever/never smoker) using REML analysis implemented with GCTA [Yang et al. (56)], among the European American participants of the Dental ARIC 
study (n = 4504) 

                                                   Genotyped SNPs     Imputed (a) SNPs         Imputed (a) SNPs 
Exclusion filters:                                 MAF < 0.05         MAF < 0.05,              MAF < 0.05, 
                                                                      R² (b) < 0.6             R² (b) < 0.3 
n(SNPs) (c):                                       656 292            2 104 905                2 131 070 
                                                   Variance (d)       Variance (d)             Variance (d) 
                                                   explained (SE)     explained (SE)           explained (SE) 

Moderate CP 
  Only SNPs considered                             0.057 (0.13)       0.097 (0.11)             0.099 (0.11) 
  + 10 PCs for population structure                0.006 (0.14)       0.068 (0.12)             0.069 (0.12) 
  + 10 PCs, sex                                    0.000 (0.14)       0.048 (0.12)             0.049 (0.12) 
  + 10 PCs, age                                    0.008 (0.14)       0.082 (0.12)             0.083 (0.12) 
  + 10 PCs, sex, age                               0.000 (0.14)       0.066 (0.12)             0.067 (0.12) 
  + 10 PCs, sex, age, [G] x [smoking history] (e)  0.358 (0.26)       0.405 (0.23)             0.391 (0.24) 
  [G] x [E] term (LRT χ²) P                        0.1                0.04                     0.04 

Severe CP 
  Only SNPs considered                             0.298 (0.18)       0.214 (0.15)             0.212 (0.15) 
  + 10 PCs for population structure                0.175 (0.19)       0.083 (0.16)             0.080 (0.16) 
  + 10 PCs, sex                                    0.221 (0.19)       0.110 (0.16)             0.109 (0.16) 
  + 10 PCs, age                                    0.171 (0.19)       0.097 (0.16)             0.094 (0.16) 
  + 10 PCs, sex, age                               0.220 (0.19)       0.127 (0.16)             0.127 (0.16) 
  + 10 PCs, sex, age, [G] x [smoking history] (e)  0.515 (0.35)       0.425 (0.31)             0.418 (0.31) 
  [G] x [E] term (LRT χ²) P                        0.04               0.06                     0.06 

GCTA, genome-wide complex trait analysis; MAF, minor allele frequency; SE, standard error; PCs, principal components; LRT, likelihood ratio test comparing 
models, including the [G] x [E] term against 'reduced' models without the interaction term. 
(a) Imputed using HapMap II-CEU. 
(b) Imputation quality score. 
(c) Number of SNPs that were used to estimate the GRM after exclusions, among the study participants as a first step in the GCTA prior to conducting REML. 
(d) Adjusted to the prevalence of CP in the Dental ARIC cohort, moderate CP: 0.43 and severe CP: 0.17. 
(e) Smoking history was defined as a binary variable, where 0: never smoker (47% of participants) and 1: ever smoker (53% of participants). 



It is well established that 'environmental' and behavioral 
factors are important in CP, with the major lifestyle risk 
factor being smoking (26,27). Although smoking is an unlike- 
ly confounder of genetic effects on CP risk, synergistic effects 
are likely. Tomar and Asma (28) used 1988-1994 NHANES 
data to estimate that up to 52% of periodontitis cases in the 
USA were attributable to current or former smoking, 
whereas among current smokers, 75% of periodontitis cases 
were attributable to smoking. Although our study was under- 
powered to efficiently examine G x E interactions, we 
detected one significant interaction of NUAK1 with smoking 
that was not replicated in the Health ABC sample. Although 
some level of misclassification in the environmental exposure 
cannot be ruled out, we believe that further investigation of 
candidate gene interactions with smoking, possibly involving 
more sensitive measures such as serum cotinine levels and 
smoking pack years, may lead to the discovery of novel risk 
loci for CP. The overall strong effect of smoking was 
confirmed in our data, where the inclusion of a smoking 
interaction term with the genetic heritable variance component 
significantly improved the phenotypic variance explained in 
both periodontitis traits, reaching up to 52% for severe 
periodontitis. Although these results were imprecise and should 
be interpreted with caution, they provide support for a 
strong interaction of smoking with common variation that 
should be explored in detail in future genetic and 
mechanistic studies. 

The lack of an overlap of risk loci for the moderate and 
severe disease traits is not surprising. Although these results 
are preliminary, the lack of overlap is consistent with our 
approach of examining the two CP diagnoses separately: 
moderate and severe CP are considered largely distinct entities, 
rather than disease progression stages. Since the first 
population-based studies of periodontal disease natural 
history, the most severe forms of disease have been distinguished 
from moderate and mild forms; those in the top 15-18% of 
the distribution have different rates of clinical progression, 
microbial composition, host response and molecular 
(inflammatory) characteristics (29-31). Interestingly, the heritability 
estimates that we obtained for the severe CP trait in this 
GWAS were consistently higher when compared with those 
obtained for the moderate trait. Also, from a statistical stand- 
point, the lack of overlap is not surprising because small sto- 
chastic variations can have a big impact on the tails of the test 
statistic distribution. However, because these traits share a 
common pathogenetic underpinning, and in our analyses we 
used the same controls for both contrasts, some overlap in 
GWA signals should be anticipated. The discovery of the 
same nervous system signaling pathway as the most signifi- 
cantly associated with both traits may indeed be an indication 
of a shared pathogenetic framework. Explorations at lower 
P-value thresholds may reveal more 'good signals' and 
common risk loci. 

The heritability estimates attributable to common variation 
that we obtained are lower when compared with previous esti- 
mates reported by Michalowicz et al. (11,32) from studies 
among twins. A more recent study used an animal model to 
estimate 35% heritability in alveolar bone loss, after adjust- 
ment for age and sex (33). As noted by Yang et al. (34), 
heritability estimates obtained via the use of GWAS SNPs 
are likely underestimates of the true heritability due to 
common variation because of incomplete LD between causal 
variants and available SNPs and the fact that causal variants 
may have lower MAF when compared with all other SNPs 
examined. We anticipate that the precision and validity of 
these estimates will improve as higher density imputations 
and whole genome sequencing are becoming more common 
and larger GWAS samples more feasible. 

A recent GWAS of generalized aggressive periodontitis 
(gAgP) among a sample of whites of European descent 
identified associations with a susceptibility locus on 9q34.3 
intronic to the glycosyltransferase 6 domain containing 1 (GLT6D1) 
gene and a shared susceptibility locus on 9p21.3 for both gAgP 
and CHD (35,36). However, gAgP is a rare form of 
periodontitis, found in less than 1% of adults, and is a distinct entity 
from CP. Notably, none of the previously reported CP 
risk candidate gene polymorphisms met nominal statistical 
significance criteria in our study. This finding requires 
further study; nevertheless, it may be reflective of the 
unavoidable systematic biases that may affect small sample size 
case-control candidate gene studies; in this respect, the 'agnostic' 
genome-wide approach that we employed using two 
population-based samples constitutes an improvement. 

Our examination of two dichotomous disease traits, includ- 
ing shared 'healthy' controls in both cohorts can be considered 
as one of the study limitations. Examinations of quantitative 
traits such as extent of attachment loss, probing depth and/or 
tooth loss that capture a continuum in disease severity might 
offer a more powerful approach when conducting GWAS; 
however, we suggest that using current diagnostic criteria 
for CP is a valuable first step in exploring the genetic basis 
of the disease. We are aware of the ongoing debate on CP 
definition and classification (37,38). Although the CDC/AAP 
classification is the consensus taxonomy implemented in epi- 
demiologic studies and surveillance, it has several limitations. 
First, the case definitions include a reversible clinical marker 
(probing depth) and second, disease ascertainment is influ- 
enced by tooth loss (sites with severe disease are lost and 
unobservable). Although the latter issue is common to all 
CP classifications, improvements have been suggested. For 
example, Offenbacher et al. (31) introduced a refined CP clas- 
sification characterizing the disease's biological (versus 
clinical-only) phenotype, a feature that may be advantageous 
in the exploration of genetic effects. Future research directions 
may include the interrogation of CP 'endophenotypes' (39), 
pleiotropic effects (40) in the context of phenome-wide asso- 
ciation studies (41) and composite phenotypes involving 
clinical, molecular and oral microbiome characteristics. Such 
investigations may help shed light on the complex CP 
etiology, pathogenesis and variable clinical manifestation, as 
well as clarify possible shared genetic underpinnings with 
other systemic conditions. 

In summary, we present the results of the first GWAS of CP 
among EA. Although none of the reported loci reached 
genome-wide significance levels, our data provide support 
for further investigation of the role of several loci in the 
risk of CP. Acknowledging that interpretation of these 
results should be made with caution, we suggest that the 
suggestive evidence on the six prioritized loci, the heritability 
estimates, including the significant interaction with smoking, 
the associated canonical signaling pathways involving the 
nervous system and immunity and the full set of GWAS 
results may serve as a rich hypothesis-generating resource. 
Future research, including larger GWA, replication and fine- 
mapping studies, as well as mechanistic and experimental 
investigations will be required to further our understanding 
of the pathogenesis of the disease and may lead to novel pre- 
ventive and therapeutic approaches. 

MATERIALS AND METHODS 

Study population 

Our discovery GWA analysis was performed in a sample of 
European American participants of the ARIC study (42) aged
53-74 years. ARIC is a prospective cohort study of atherosclerosis,
cardiovascular disease risk factors and outcomes that
enrolled 15 792 community-dwelling residents in 4 US com- 
munities (Jackson, MS; Washington County, MD; suburban 
Minneapolis, MN; and Forsyth County, NC) between 1987 
and 1989. The Dental ARIC is a National Institute of Dental 
and Craniofacial Research-funded ancillary study that took 
place during the fourth ARIC visit (1996-1998) and included 
complete oral-dental examinations in a subset (n = 6017) of
dentate ARIC participants. The Health ABC Study is a 
National Institute on Aging-sponsored longitudinal investiga- 
tion examining factors that contribute to incident disability 
and functional decline of healthier older persons, with a par- 
ticular emphasis on changes in body composition in old age 
(43). Between April 1997 and June 1998, the Health ABC
study recruited 3075 well-functioning, community-dwelling
adults aged 70-79 years. Study participants were recruited
from a random sample of Medicare beneficiaries in Pittsburgh, 
Pennsylvania and Memphis, Tennessee. As part of the study 
year 2 and 3 follow-up clinical visits (1998-2000), a total 
of 1133 EA and African American participants received 
complete dental and periodontal examinations (44,45). 

Phenotype measurement and definition 

As part of the Dental ARIC, ancillary study participants under- 
went detailed oral-periodontal examinations that recorded the 
number of missing teeth, probing depths, attachment loss mea- 
surements and bleeding upon probing at six sites per tooth, 
including third molars. Dental ARIC clinical examiners were 
trained and calibrated against a standard examiner, with 
kappas indicating excellent to outstanding level of agreement 
(46). During the Health ABC oral examinations, trained and 
calibrated examiners obtained clinical plaque index, gingival 
index, PD and CAL measurements. An a priori minimum 
level of 90% agreement on all measures was set as a bench- 
mark and was achieved by all examiners (45). We used the 
CDC and AAP consensus three-level classification for the 
disease trait definition in both cohorts. The CDC/AAP taxonomy
uses CAL and PD criteria to define three CP categories:
healthy-mild (n = 1864), moderate (n = 1961) and severe
(n = 785) (47). Based on this definition, we created two 
dichotomous traits: severe CP (severe versus healthy) and 
moderate CP (moderate versus healthy). To fully describe 
the clinical features of these traits, we present the distribution 
of the following clinical periodontal measures, overall and
stratified by periodontitis diagnosis in each cohort: plaque 
score, gingival index, extent of bleeding on probing, mean 
probing depth (mm), mean attachment loss (mm), extent of 
probing depth (>4 mm), extent of attachment loss (>3 mm) 
and number of natural teeth. 

Genotyping, imputation and quality control 

In the ARIC study, population DNA was extracted from blood 
samples drawn from an antecubital vein into tubes containing 
serum separator gel. Blood samples were analyzed at a central 
ARIC laboratory in Houston, TX, USA. Genotyping was per- 
formed using the Affymetrix Genome-Wide Human SNP 
Array 6.0 chip that offers 906 600 markers for SNPs. The 
quality control procedures included initial blind duplicate 
genotyping and identification/flagging of SNPs with k < 
0.95 and reconciliation of unintentional duplicate samples 
(17 duplicates and 1 triplicate). Imputation to 2.5 million 
markers was performed using 669 450 SNPs and the MACH 
program version 1.0.16 (48), based on HapMap Phase II 
CEU build 36. The SNPs used for imputation were selected 
from 839 048 autosomal SNPs restricted to those with
MAF > 0.01 (129 543 excluded), Hardy-Weinberg equilibrium
(HWE) P > 10^-5 (12 432 excluded) and call rate
>95% (1693 excluded). We used the following criteria for ex- 
clusion of SNPs from further analyses: quality score < 0.8 and 
missing data rate > 10% after imputation and MAF of <5%. 
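
The pre-imputation SNP filters described above can be expressed as a simple predicate. The following is a minimal Python sketch, assuming that per-SNP summary statistics have already been computed; the class and field names are hypothetical, and only the three thresholds (MAF > 0.01, HWE P > 10^-5, call rate > 95%) are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class SnpSummary:
    """Per-SNP summary statistics (hypothetical field names)."""
    snp_id: str
    maf: float        # minor allele frequency
    hwe_p: float      # Hardy-Weinberg equilibrium test P-value
    call_rate: float  # proportion of samples with a called genotype


def passes_preimputation_qc(snp, min_maf=0.01, min_hwe_p=1e-5, min_call_rate=0.95):
    """Return True if the SNP meets all three thresholds quoted in the text."""
    return (snp.maf > min_maf
            and snp.hwe_p > min_hwe_p
            and snp.call_rate > min_call_rate)


# Toy input; the values are illustrative and are not study data.
snps = [
    SnpSummary("snp_a", maf=0.20, hwe_p=0.5, call_rate=0.99),    # passes
    SnpSummary("snp_b", maf=0.005, hwe_p=0.5, call_rate=0.99),   # fails MAF
    SnpSummary("snp_c", maf=0.20, hwe_p=1e-7, call_rate=0.99),   # fails HWE
    SnpSummary("snp_d", maf=0.20, hwe_p=0.5, call_rate=0.90),    # fails call rate
]
kept = [s.snp_id for s in snps if passes_preimputation_qc(s)]
```

The same predicate could be reused with the Health ABC thresholds by passing different keyword arguments.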

In the Health ABC Study, population genomic DNA was 
extracted from buffy coat collected using PUREGENE® 
DNA Purification Kit during the baseline examination.
Genotyping was performed by the Center for Inherited Disease
Research using the Illumina Human1M-Duo BeadChip system.
Illumina BeadStudio was used to call genotypes. Samples
were excluded from the dataset because of sample failure,
sex mismatch or first-degree relatedness to an included
individual based on genotype data. Genotyping was successful
for 1 151 215 SNPs in 1663 European American participants. 
Imputation for autosomal SNPs was done using MACH 
version 1.0.16. SNPs with MAF > 1%, call rate > 97% and
HWE P > 10^-6 were used for imputation using HapMap
Phase II CEU build 36 resulting in a final set of 2 543 887 
SNPs. 

Population stratification 

To obtain estimates of relatedness and population stratification 
in ARIC, a subset of 85 947 'high quality' SNPs was selected. 
These SNPs met the following criteria: MAF > 0.1, call
rate > 99.5%, HWE P > 10^-3, autosomal, with annotation
in the platform annotation file, not labeled 'AFFX' or 
'chromosome 0' and not monomorphic. Using these SNPs, 
identity-by-state (IBS) allele sharing distance (DST values) 
was computed using PLINK (49) as follows: DST = IBS distance =
(IBS2 + 0.5 × IBS1)/(N SNP pairs). First-degree relative
status was assigned to pairs of individuals with DST > 0.8, 
and second-degree relatives were considered those with 
0.763 < DST < 0.8. There were 380 pairs of first degree and 
207 pairs of second-degree relatives identified among the 
ARIC white participants of European descent. To minimize
exclusions, related pairs were broken by iterative selection
of the individuals with the most relatives using a custom-written
program.
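
The relatedness steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the custom-written program used in the study: the DST expression follows the PLINK definition quoted in the text, the cut-offs (0.8 and 0.763) are those reported above, and the greedy rule with alphabetical tie-breaking is a hypothetical reading of "iterative selection of the individuals with the most relatives".

```python
from collections import Counter


def ibs_distance(ibs0, ibs1, ibs2):
    """DST = (IBS2 + 0.5 * IBS1) / (number of SNP pairs compared)."""
    n_pairs = ibs0 + ibs1 + ibs2
    if n_pairs == 0:
        raise ValueError("no SNP pairs were compared")
    return (ibs2 + 0.5 * ibs1) / n_pairs


def classify_pair(dst, first_degree_cutoff=0.8, second_degree_cutoff=0.763):
    """Label a pair using the DST cut-offs reported in the text."""
    if dst > first_degree_cutoff:
        return "first-degree"
    if dst > second_degree_cutoff:
        return "second-degree"
    return "unrelated"


def break_related_pairs(pairs):
    """Greedily remove the individual involved in the most remaining pairs.

    Ties are broken by identifier (an assumption made for reproducibility).
    Returns the list of removed individuals, in order of removal.
    """
    remaining = [tuple(pair) for pair in pairs]
    removed = []
    while remaining:
        counts = Counter(ind for pair in remaining for ind in pair)
        worst = min(counts, key=lambda ind: (-counts[ind], ind))
        removed.append(worst)
        remaining = [pair for pair in remaining if worst not in pair]
    return removed


# Toy example: A is related to both B and C, and D is related to E.
removed = break_related_pairs([("A", "B"), ("A", "C"), ("D", "E")])
```

Removing the most connected individual first breaks several related pairs with a single exclusion, which is why such a rule tends to retain more participants than dropping one member of each pair independently.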

Population stratification was further evaluated with princi- 
pal component (PC) analysis using the EIGENSTRAT 
program (50). The previously chosen set of SNPs was used 
for the computation of 10 PCs. Genetic outliers were defined
as individuals lying more than 8 standard deviations from the
mean on any of the 10 PCs over 10 runs of PC computation. Based
on DST and PC criteria, there were 716 subjects flagged for 
removal from the analysis [206 as genetic outliers based on 
PCs and 16 based on average DST values ('too little IBS 
sharing' with the rest of the sample), 351 first-degree relatives 
and 143 second-degree relatives]. All but 10 second-degree
relatives (whose relatives were excluded as genetic outliers) 
were reentered in the dataset and were assigned PCs. After 
exclusion of 364 individuals (4%), there were 9349 European 
Americans who were included in the GWA analysis, and 
of those, 4504 had periodontal phenotype data available as 
Dental ARIC participants.

Analytical strategy 

The association between SNPs and the two disease traits 
(severe and moderate CP) was tested using logistic regression 
models assuming log-additive allelic effects, adjusting for age, 
sex, examination center and ancestry (10 first PCs). Per allele 
ORs and 95% CIs and associated P-values were estimated for 
each CP trait. Consistent with the current trends in reporting of 
GWAS (15), we have made available the results of the entire 
set of SNPs that we investigated. Because CP risk factors are
not expected to confound the association between SNPs and the
examined traits, the main analysis models were
not adjusted for smoking or diabetic status. However, in the
Supplementary Material, we present the results (effect esti- 
mates and percentage of change in estimate) of additional ex- 
ploratory analyses, adjusting for smoking status and diabetes. 
In the ARIC 'discovery' cohort, we applied a multiple-test 
correction assuming 1 million independent tests, resulting in
a genome-wide significance level of P < 5 × 10^-8. All
genetic analyses were performed with the ProbABEL software
(51). An a priori threshold of P < 5 × 10^-6 was set for prioritizing
SNPs for further investigation and replication in the
Health ABC Study. To inspect for any substantial differences 
in the effects of prioritized SNPs between males and females 
in ARIC, we conducted additional exploratory analyses strati- 
fied by sex. The departure from between-sexes homogeneity 
was assessed by inspection of non-overlapping CIs. To 
provide a more comprehensive view of the potential role of 
these SNPs in periodontitis-related traits, we obtained and em- 
pirically examined their effect size, direction, and P-values for 
four additional 'high periodontal pathogen colonization' traits 
that were available for a subset of participants in the European 
American Dental ARIC cohort (n = 1020) and have been 
previously studied in the context of a GWAS of the 
oral-periodontal microbiome (17). 
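
Two small calculations underlie the reporting described above: the multiple-testing threshold and the conversion of a log-additive regression coefficient into a per-allele OR with its 95% CI. The sketch below is illustrative only. It assumes a family-wise alpha of 0.05 (not stated explicitly in the text, but consistent with 1 million tests yielding 5 × 10^-8) and a normal approximation for the CI; the function names are hypothetical and the example coefficient is not study data.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level under a Bonferroni correction."""
    return alpha / n_tests


def odds_ratio_ci(beta, se, z=1.96):
    """Per-allele OR and 95% CI from a log-odds coefficient and its standard error."""
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))


# Assumed alpha of 0.05 over 1 million independent tests.
genome_wide_alpha = bonferroni_threshold(0.05, 1_000_000)

# Hypothetical coefficient, for illustration only.
or_point, or_lower, or_upper = odds_ratio_ci(beta=0.30, se=0.08)
```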

Post-analysis procedures included the generation of
quantile-quantile (Q-Q) and Manhattan plots and detailed
annotations of gene context. We also examined the association
of the prioritized SNPs with gene expression (as expression
quantitative trait loci) using the SCAN (http://www.scandb.org)
database and reported the P-values associated with gene
expression in LCLs. Furthermore, P-values and effect estimates [OR
(95% CI)] of the prioritized variants (one 'top' SNP per prior- 
itized locus, as determined by the lowest P-value) in the 
Health ABC Study were obtained and examined for nominal 
significance, as well as for effect direction and magnitude con- 
cordance. 'Replication' was determined as concordant effect 
direction and nominal significance (P < 0.05) in the Health 
ABC Study. Effect estimates of SNPs that did not show 
substantial heterogeneity and 'generalized' to the Health 
ABC Study were combined in inverse-variance weighted 
meta-analysis to produce pooled (summary) ORs. 
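
The inverse-variance weighting scheme mentioned above can be illustrated as follows. This is a minimal fixed-effect sketch written for clarity and is not the implementation in the software used for the study; the function name and the input values are hypothetical. Each cohort contributes a log-odds estimate weighted by the reciprocal of its squared standard error.

```python
import math


def inverse_variance_meta(estimates):
    """Fixed-effect pooling of (beta, se) pairs given on the log-odds scale.

    Returns the pooled beta, its standard error and a two-sided P-value
    based on the normal approximation.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return pooled_beta, pooled_se, p_value


# Hypothetical discovery and replication estimates (not study data).
pooled_beta, pooled_se, p_value = inverse_variance_meta([(0.40, 0.10), (0.30, 0.25)])
pooled_or = math.exp(pooled_beta)
```

Because the weights are proportional to precision, the pooled estimate lies closer to the estimate from the larger (more precise) cohort.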

The prioritized SNPs were annotated using WGA Viewer
ver. 1.261 (52) and Snipper ver. 1.2
(http://csg.sph.umich.edu/boehnke/snipper/), and regions of interest were viewed using
LocusZoom ver. 1.1 (53). We used PolyPhen-2 for the predic- 
tion of potentially damaging missense changes (54) and 
METAL (55) for the summarization and meta-analysis of esti- 
mates derived from the two cohorts. We used additional online 
resources of the National Center for Biotechnology Information
(NCBI; http://www.ncbi.nlm.nih.gov/). Reporting of
genes was based on the 'HUGO Gene Nomenclature' 
naming convention (http://www.genenames.org). The full 
names and genomic locations of the reported genes are pre- 
sented in the Supplementary Material, Table S8. 

The estimation of heritable variance (h^2) in the two periodontitis
traits explained by all available SNPs in the
present GWAS was performed with the use of GCTA (56). 
GCTA is based on a two-step method, where in the first 
step, all available SNPs are used to estimate a genetic relation- 
ship matrix (GRM) among all study participants. Subsequent- 
ly, the GRM is used in restricted maximum likelihood 
(REML) analyses to estimate the proportion of variance 
explained by all SNPs. Adjustment for population stratification 
and the inclusion of additional covariates, including
gene-environment interaction (G × E) terms, are feasible in the REML
step. Because the variance explained may be influenced by the 
unknown LD structure of available SNPs with causal variants, 
we examined three different sets of SNPs for the computation 
of the GRM: genotyped SNPs with MAF > 0.05, as well as 
combinations of imputed SNPs with MAF > 0.05 and imputation
quality score (R^2) > 0.3 and > 0.6. The maximum
number of SNPs used was 2 131 070. Additionally, the inclu- 
sion of an interaction term with smoking was examined using 
a binary definition of smoking history (ever/never smoker) and 
inspecting the variance explained and a likelihood ratio test 
P-value (using a P < 0.2 criterion, as the examination of inter- 
action effects is based on de facto underpowered analyses). 
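
The first step of the two-step approach, the construction of the GRM, can be sketched in plain Python. The sketch assumes complete genotype data coded as 0/1/2 allele counts and estimates allele frequencies from the sample itself; it applies one standardized cross-product expression to every entry of the matrix, so the software's own treatment of the diagonal and of missing genotypes is not reproduced here. The function name is hypothetical.

```python
def genetic_relationship_matrix(genotypes):
    """Compute a GRM from allele counts.

    genotypes[i][j] is the allele count (0, 1 or 2) of individual j at SNP i.
    Entry (j, k) is the average over SNPs of
    (x_ij - 2 p_i) * (x_ik - 2 p_i) / (2 p_i (1 - p_i)).
    """
    n_ind = len(genotypes[0])
    grm = [[0.0] * n_ind for _ in range(n_ind)]
    n_used = 0
    for snp in genotypes:
        p = sum(snp) / (2.0 * n_ind)      # sample allele frequency
        variance = 2.0 * p * (1.0 - p)
        if variance == 0.0:
            continue                       # skip monomorphic SNPs
        centered = [x - 2.0 * p for x in snp]
        for j in range(n_ind):
            for k in range(n_ind):
                grm[j][k] += centered[j] * centered[k] / variance
        n_used += 1
    if n_used == 0:
        raise ValueError("no polymorphic SNPs available")
    return [[value / n_used for value in row] for row in grm]


# Toy data: 2 SNPs genotyped in 3 individuals (illustrative only).
grm = genetic_relationship_matrix([[0, 1, 2], [2, 1, 0]])
```

In the second step, a matrix of this kind enters a mixed model whose variance components are estimated by REML; that step is not shown.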

For the pathway analyses, we first identified a set
of 'independent' signals providing nominal evidence of association
(using an arbitrary P-value criterion of < 10^-2) for
each of the two disease traits using LD pruning. This step
was implemented with PLINK, filtering SNPs that were correlated
(r^2 > 0.1) with the index SNP (lowest P-value) in each
locus and within a distance of 400 kb. Next, the closest or
harboring gene was assigned to each locus, if one was found
within 100 kb of the index SNP. Thus, 2 sets of loci and genes
(2328 nominally associated with moderate CP and 2425
nominally associated with severe CP) were carried forward to
pathway analysis, which was conducted using IPA (Ingenuity®
Systems, www.ingenuity.com, Redwood City, CA, USA). As
a sensitivity analysis, we explored the use of a more stringent
criterion (P < 10^-3) to carry forward association signals in
pathway analyses, which resulted in 2 additional sets of loci,
enumerating 361 genes for each disease trait.
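
The LD-based selection of independent signals can be illustrated with a simple greedy procedure. The sketch below is not the PLINK implementation; it only mirrors the parameters quoted in the text (P-value criterion, r^2 > 0.1, 400 kb). The tuple layout and the `r2` look-up function are hypothetical, and the window is interpreted here as the maximum distance from the index SNP.

```python
def select_index_snps(snps, r2, p_max=1e-2, r2_max=0.1, window_bp=400_000):
    """Greedy selection of index SNPs.

    snps: list of (snp_id, chromosome, position_bp, p_value) tuples.
    r2:   callable returning the pairwise r^2 between two SNP identifiers.
    """
    candidates = sorted((s for s in snps if s[3] < p_max), key=lambda s: s[3])
    index_snps = []
    claimed = set()
    for snp_id, chrom, pos, _ in candidates:
        if snp_id in claimed:
            continue
        index_snps.append(snp_id)          # lowest remaining P-value
        claimed.add(snp_id)
        for other_id, other_chrom, other_pos, _ in candidates:
            if other_id in claimed:
                continue
            nearby = other_chrom == chrom and abs(other_pos - pos) <= window_bp
            if nearby and r2(snp_id, other_id) > r2_max:
                claimed.add(other_id)      # correlated with the index SNP
    return index_snps


# Toy data with a single correlated pair (illustrative only).
toy_snps = [
    ("rs_a", "1", 100_000, 1e-4),
    ("rs_b", "1", 150_000, 5e-3),
    ("rs_c", "1", 900_000, 2e-3),
    ("rs_d", "2", 100_000, 0.5),
]
toy_r2 = {frozenset(("rs_a", "rs_b")): 0.8}
index_snps = select_index_snps(toy_snps, lambda a, b: toy_r2.get(frozenset((a, b)), 0.0))
```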

Consistent with the current understanding of pathogenesis 
of periodontitis that entails dysbiotic host-pathogen interac- 
tions, detrimental immune response (1,8) and the emerging 
role of the nervous system (24,57), we investigated the enrichment
of Ingenuity canonical signaling pathways in the following
categories: cellular immune response, cytokine signaling,
humoral immune response, neurotransmitters and other 
nervous system signaling and pathogen-influenced signaling. 
There were 147 canonical pathways listed in these categories 
(http://genomewide.net/public/aric/dental/periodontitis/CDC/CP_candidate_canonical_pathways.xls).
To serve as a control,
we examined the enrichment of the IPA metabolic group of 
pathways (n = 126). The association between pathways and 
the data was determined with Fisher's exact test and a 
Bonferroni-corrected P-value threshold for multiple pathways 
tested (critical P < 3.4 × 10^-4). For each pathway, we also
present the ratio of the number of molecules in the data that 
mapped to each canonical pathway over the number of 
molecules in the pathway. 
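
The enrichment test and the corrected threshold can be sketched as follows. This is an illustration under stated assumptions rather than the procedure implemented in the pathway analysis software: the test is taken to be one-sided (over-representation), alpha is assumed to be 0.05, and the function and argument names are hypothetical. Only the number of candidate pathways (147) comes from the text.

```python
from math import comb


def enrichment_p_value(overlap, n_data_genes, n_pathway_genes, n_background):
    """One-sided Fisher's exact (hypergeometric) P-value.

    Probability of observing at least `overlap` pathway genes among
    `n_data_genes` drawn from a background of `n_background` genes, of
    which `n_pathway_genes` belong to the pathway.
    """
    upper = min(n_data_genes, n_pathway_genes)
    total = comb(n_background, n_data_genes)
    tail = sum(
        comb(n_pathway_genes, i) * comb(n_background - n_pathway_genes, n_data_genes - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


def pathway_ratio(overlap, n_pathway_genes):
    """Fraction of the pathway's molecules that map to the data."""
    return overlap / n_pathway_genes


# Bonferroni threshold for 147 candidate pathways, assuming alpha = 0.05.
critical_p = 0.05 / 147
```

With these assumptions, 0.05/147 is approximately 3.4 × 10^-4, which matches the critical value quoted above.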

SUPPLEMENTARY MATERIAL 

Supplementary Material is available at HMG online. 